Active Learning with Distributional Estimates

Authors

  • Jens Röder
  • Boaz Nadler
  • Kevin Kunzmann
  • Fred A. Hamprecht
Abstract

Active Learning (AL) is increasingly important in a broad range of applications. Two main AL principles to obtain accurate classification with few labeled data are refinement of the current decision boundary and exploration of poorly sampled regions. In this paper we derive a novel AL scheme that balances these two principles in a natural way. In contrast to many AL strategies, which are based on an estimated class conditional probability p̂(y|x), a key component of our approach is to view this quantity as a random variable, hence explicitly considering the uncertainty in its estimated value. Our main contribution is a novel mathematical framework for uncertainty-based AL, and a corresponding AL scheme, where the uncertainty in p̂(y|x) is modeled by a second-order distribution. On the practical side, we show how to approximate such second-order distributions for kernel density classification. Finally, we find that over a large number of UCI, USPS and Caltech-4 datasets, our AL scheme achieves significantly better learning curves than popular AL methods such as uncertainty sampling and error reduction sampling, when all use the same kernel density classifier.
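
The idea of treating p̂(y|x) as a random variable rather than a point estimate can be illustrated with a small sketch. The abstract does not spell out how the authors approximate the second-order distribution for kernel density classification, so the code below uses a stand-in assumption: kernel-weighted class counts around a query point are treated as pseudo-counts of a Beta distribution over p(y=1|x). The function names, the Gaussian kernel, the uniform prior, and the toy data are all hypothetical and serve only to show why two points with the same point estimate can carry very different uncertainty.

```python
import math

def gaussian_kernel(u, bandwidth):
    """Unnormalized Gaussian kernel weight for distance u."""
    return math.exp(-0.5 * (u / bandwidth) ** 2)

def kernel_class_weights(x, points, labels, bandwidth=1.0):
    """Kernel-weighted soft counts (w0, w1) of each class around x."""
    w = [0.0, 0.0]
    for xi, yi in zip(points, labels):
        w[yi] += gaussian_kernel(x - xi, bandwidth)
    return w[0], w[1]

def beta_summary(w0, w1, prior=1.0):
    """Mean and variance of Beta(prior + w1, prior + w0) over p(y=1|x).

    Assumption: soft counts act as Beta pseudo-counts. This is an
    illustrative stand-in, not the approximation used in the paper.
    """
    a, b = prior + w1, prior + w0
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var

# Toy 1-D labeled set: class 0 left of the origin, class 1 right of it.
points = [-0.1, -0.2, -0.3, 0.1, 0.2, 0.3]
labels = [0, 0, 0, 1, 1, 1]

# Query on the decision boundary, in a well-sampled region.
mean_dense, var_dense = beta_summary(*kernel_class_weights(0.0, points, labels))

# Query far from all labeled data, in a poorly sampled region.
mean_sparse, var_sparse = beta_summary(*kernel_class_weights(10.0, points, labels))

print(f"boundary point:   mean={mean_dense:.3f}, variance={var_dense:.4f}")
print(f"unexplored point: mean={mean_sparse:.3f}, variance={var_sparse:.4f}")
```

Both queries have a point estimate close to 0.5, so plain uncertainty sampling would rank them equally. Under the assumed Beta model the unexplored point has the larger variance, which is the kind of distinction between boundary refinement and exploration that a second-order distribution makes available to a selection criterion.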

Similar resources

An Adaptive Strategy for Active Learning with Smooth Decision Boundary

We present the first adaptive strategy for active learning in the setting of classification with smooth decision boundary. The problem of adaptivity (to unknown distributional parameters) has remained open since the seminal work of Castro and Nowak (2007), which first established (active learning) rates for this setting. While some recent advances on this problem establish adaptive rates in t...

Heuristics in exploration: Distributional information is selectively used for active learning

Everyday decision-making is filled with choices about what to act on, with outcomes playing a critical role in learning. Information gain is oft cited as a valuable approach to maximize potential learning, but its computation is costly. It entails evaluating the probability of multiple outcomes given any possible action, and then considering the degree of belief-change over all possibilities. G...

Distributional Term Set Expansion

This paper is a short empirical study of the performance of centrality and classification based iterative term set expansion methods for distributional semantic models. Iterative term set expansion is an interactive process using distributional semantics models where a user labels terms as belonging to some sought after term set, and a system uses this labeling to supply the user with new, cand...

Improving English Named Entity Recognition

In recent years, much of the work in named entity recognition has focused on tackling entities in different languages or domains. However, the task of English named entity recognition still remains to be solved. In this paper we explore more ways to improve the English named entity recognition system beyond just distributional semantics and the use of external gazetteers. These ways include p...

Fast phonetic learning occurs already in 2-to-3-month old infants: an ERP study

An important mechanism for learning speech sounds in the first year of life is "distributional learning," i.e., learning by simply listening to the frequency distributions of the speech sounds in the environment. In the lab, fast distributional learning has been reported for infants in the second half of the first year; the present study examined whether it can also be demonstrated at a much yo...

Publication date: 2012